<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
  <title>tango.text.Regex</title>
  <link href="./css/style.css" rel="stylesheet" type="text/css"/>
  <!-- <link href="./img/icon.png" rel="icon" type="image/png"/> -->
  <script type="text/javascript" src="./js/jquery.js"></script>
  <script type="text/javascript" src="./js/modules.js"></script>
  <script type="text/javascript" src="./js/quicksearch.js"></script>
  <script type="text/javascript" src="./js/navigation.js"></script>
  <!--<script type="text/javascript" src="./js/jquery.treeview.js"></script>-->
  <script type="text/javascript">
    var g_moduleFQN = "tango.text.Regex";
  </script>
  
</head>
<body>
<div id="content">
  <h1><a href="./htmlsrc/tango.text.Regex.html" class="symbol">tango.text.Regex</a></h1>
  
<p class="sec_header">License:</p>BSD style: see <a href="http://www.dsource.org/projects/tango/wiki/LibraryLicense">license.txt</a>
<p class="sec_header">Version:</p>Initial release: Jan 2008
<p class="sec_header">Authors:</p>Jascha Wetzel
<p class="bl"/>
    This is a regular expression compiler and interpreter based on the Tagged NFA/DFA method.
<p class="bl"/>
    See <a href="http://en.wikipedia.org/wiki/Regular_expression">Wikpedia's article on regular expressions</a>
    for details on regular expressions in general.
<p class="bl"/>
    The used method implies, that the expressions are <i>regular</i>, in the way language theory defines it,
    as opposed to what &quot;regular expression&quot; means in most implementations
    (e.g. PCRE or those from the standard libraries of Perl, Java or Python).
    The advantage of this method is it's performance, it's disadvantage is the inability to realize some features
    that Perl-like regular expressions have (e.g. back-references).
    See <a href="http://swtch.com/~rsc/regexp/regexp1.html">&quot;Regular Expression Matching Can Be Simple And Fast&quot;</a>
    for details on the differences.
<p class="bl"/>
    The time for matching a regular expression against an input string of length N is in O(M*N), where M depends on the
    number of matching brackets and the complexity of the expression. That is, M is constant wrt. the input
    and therefore matching is a linear-time process.
<p class="bl"/>
    The syntax of a regular expressions is as follows.
    <i>X</i> and <i>Y</i> stand for an arbitrary regular expression.
<p class="bl"/>
    <table border=1 cellspacing=0 cellpadding=5>
    <caption>Operators</caption>
    <tr><td>X|Y</td> <td>alternation, i.e. X or Y</td> </tr>
    <tr><td>(X)</td> <td>matching brackets - creates a sub-match</td> </tr>
    <tr><td>(?X)</td> <td>non-matching brackets - only groups X, no sub-match is created</td> </tr>
    <tr><td>[Z]</td> <td>character class specification, Z is a string of characters or character ranges, e.g. [a-zA-Z0-9_.\-]</td> </tr>
    <tr><td>[^Z]</td> <td>negated character class specification</td> </tr>
    <tr><td>&lt;X</td> <td>lookbehind, X may be a single character or a character class</td> </tr>
    <tr><td>&gt;X</td> <td>lookahead, X may be a single character or a character class</td> </tr>
    <tr><td>^</td> <td>start of input or start of line</td> </tr>
    <tr><td>$</td> <td>end of input or end of line</td> </tr>
    <tr><td>\b</td> <td>start or end of word, equals (?&lt;\s&gt;\S|&lt;\S&gt;\s)</td> </tr>
    <tr><td>\B</td> <td>opposite of \b, equals (?&lt;\S&gt;\S|&lt;\s&gt;\s)</td> </tr>
    </table>
<p class="bl"/>
    <table border=1 cellspacing=0 cellpadding=5>
    <caption>Quantifiers</caption>
    <tr><td>X?</td> <td>zero or one</td> </tr>
    <tr><td>X*</td> <td>zero or more</td> </tr>
    <tr><td>X+</td> <td>one or more</td> </tr>
    <tr><td>X{n,m}</td> <td>at least n, at most m instances of X.<br>If n is missing, it's set to 0.<br>If m is missing, it is set to infinity.</td> </tr>
    <tr><td>X??</td> <td>non-greedy version of the above operators</td> </tr>
    <tr><td>X*?</td> <td>see above</td> </tr>
    <tr><td>X+?</td> <td>see above</td> </tr>
    <tr><td>X{n,m}?</td> <td>see above</td> </tr>
    </table>
<p class="bl"/>
    <table border=1 cellspacing=0 cellpadding=5>
    <caption>Pre-defined character classes</caption>
    <tr><td>.</td> <td>any printable character</td> </tr>
    <tr><td>\s</td> <td>whitespace</td> </tr>
    <tr><td>\S</td> <td>non-whitespace</td> </tr>
    <tr><td>\w</td> <td>alpha-numeric characters or underscore</td> </tr>
    <tr><td>\W</td> <td>opposite of \w</td> </tr>
    <tr><td>\d</td> <td>digits</td> </tr>
    <tr><td>\D</td> <td>non-digit</td> </tr>
    </table>
<dl>
<dt class="decl">class <a class="symbol _class" name="RegExpT" href="./htmlsrc/tango.text.Regex.html#L3598" kind="class" beg="3598" end="4319">RegExpT</a><span class="tparams">(char_t)</span>; <a title="Permalink to this symbol" href="#RegExpT" class="symlink">¶</a><a title="Go to the HTML source file" class="srclink" href="./htmlsrc/tango.text.Regex.html#L3598">#</a></dt>
<dd class="ddef">
<div class="summary">Regular expression compiler and interpreter.</div>
<dl>
<dt class="decl"><a class="symbol _ctor" name="RegExpT.this" href="./htmlsrc/tango.text.Regex.html#L3617" kind="ctor" beg="3617" end="3620">this</a><span class="params">(char_t[] <em>pattern</em>, char_t[] <em>attributes</em> = null)</span>; <a title="Permalink to this symbol" href="#RegExpT.this" class="symlink">¶</a><a title="Go to the HTML source file" class="srclink" href="./htmlsrc/tango.text.Regex.html#L3617">#</a></dt>
<dt class="decl"><a class="symbol _ctor" name="RegExpT.this:2" href="./htmlsrc/tango.text.Regex.html#L3623" kind="ctor" beg="3623" end="3643">this</a><span class="params">(char_t[] <em>pattern</em>, bool <em>swapMBS</em>, bool <em>unanchored</em>, bool <em>printNFA</em> = false)</span>; <a title="Permalink to this symbol" href="#RegExpT.this:2" class="symlink">¶</a><a title="Go to the HTML source file" class="srclink" href="./htmlsrc/tango.text.Regex.html#L3623">#</a></dt>
<dd class="ddef">
<div class="summary">Construct a RegExpT object.</div>
<p class="sec_header">Params:</p>
<table class="params">
<tr><td><em>pattern</em></td><td>Regular expression.</td></tr></table>
<p class="sec_header">Throws:</p>RegExpException if there are any compilation errors.
<p class="sec_header">Example:</p>Declare two variables and assign to them a Regex object:
            <pre class="d_code">
<span class="k">auto</span> <span class="i">r</span> = <span class="k">new</span> <span class="i">Regex</span>(<span class="sl">"pattern"</span>);
<span class="k">auto</span> <span class="i">s</span> = <span class="k">new</span> <span class="i">Regex</span>(<span class="sl">r"p[1-5]\s*"</span>);
</pre></dd>
<dt class="decl">RegExpT!(char_t) <a class="symbol _function" name="RegExpT.opCall" href="./htmlsrc/tango.text.Regex.html#L3657" kind="function" beg="3657" end="3660">opCall</a><span class="params">(char_t[] <em>pattern</em>, char_t[] <em>attributes</em> = null)</span>; <span class="attrs">[<span class="stc">static</span>]</span> <a title="Permalink to this symbol" href="#RegExpT.opCall" class="symlink">¶</a><a title="Go to the HTML source file" class="srclink" href="./htmlsrc/tango.text.Regex.html#L3657">#</a></dt>
<dd class="ddef">
<div class="summary">Generate instance of Regex.</div>
<p class="sec_header">Params:</p>
<table class="params">
<tr><td><em>pattern</em></td><td>Regular expression.</td></tr></table>
<p class="sec_header">Throws:</p>RegExpException if there are any compilation errors.
<p class="sec_header">Example:</p>Declare two variables and assign to them a Regex object:
            <pre class="d_code">
<span class="k">auto</span> <span class="i">r</span> = <span class="i">Regex</span>(<span class="sl">"pattern"</span>);
<span class="k">auto</span> <span class="i">s</span> = <span class="i">Regex</span>(<span class="sl">r"p[1-5]\s*"</span>);
</pre></dd>
<dt class="decl">RegExpT!(char_t) <a class="symbol _function" name="RegExpT.search" href="./htmlsrc/tango.text.Regex.html#L3682" kind="function" beg="3682" end="3688">search</a><span class="params">(char_t[] <em>input</em>)</span>; <span class="attrs">[<span class="prot">public</span>]</span> <a title="Permalink to this symbol" href="#RegExpT.search" class="symlink">¶</a><a title="Go to the HTML source file" class="srclink" href="./htmlsrc/tango.text.Regex.html#L3682">#</a></dt>
<dt class="decl">int <a class="symbol _function" name="RegExpT.opApply" href="./htmlsrc/tango.text.Regex.html#L3691" kind="function" beg="3691" end="3697">opApply</a><span class="params">(int delegate(inout RegExpT!(char_t)) <em>dg</em>)</span>; <span class="attrs">[<span class="prot">public</span>]</span> <a title="Permalink to this symbol" href="#RegExpT.opApply" class="symlink">¶</a><a title="Go to the HTML source file" class="srclink" href="./htmlsrc/tango.text.Regex.html#L3691">#</a></dt>
<dd class="ddef">
<div class="summary">Set up for start of foreach loop.</div>
<p class="sec_header">Returns:</p>Instance of RegExpT set up to search input.
<p class="sec_header">Example:</p><pre class="d_code">
<span class="k">import</span> <span class="i">tango</span>.<span class="i">io</span>.<span class="i">Stdout</span>;
<span class="k">import</span> <span class="i">tango</span>.<span class="i">text</span>.<span class="i">Regex</span>;

<span class="k">void</span> <span class="i">main</span>()
{
    <span class="k">foreach</span>(<span class="i">m</span>; <span class="i">Regex</span>(<span class="sl">"ab"</span>).<span class="i">search</span>(<span class="sl">"qwerabcabcababqwer"</span>))
        <span class="i">Stdout</span>.<span class="i">formatln</span>(<span class="sl">"{}[{}]{}"</span>, <span class="i">m</span>.<span class="i">pre</span>, <span class="i">m</span>.<span class="i">match</span>(<span class="n">0</span>), <span class="i">m</span>.<span class="i">post</span>);
}
<span class="lc">// Prints:</span>
<span class="lc">// qwer[ab]cabcababqwer</span>
<span class="lc">// qwerabc[ab]cababqwer</span>
<span class="lc">// qwerabcabc[ab]abqwer</span>
<span class="lc">// qwerabcabcab[ab]qwer</span>
</pre></dd>
<dt class="decl">bool <a class="symbol _function" name="RegExpT.test" href="./htmlsrc/tango.text.Regex.html#L3703" kind="function" beg="3703" end="3709">test</a><span class="params">(char_t[] <em>input</em>)</span>; <a title="Permalink to this symbol" href="#RegExpT.test" class="symlink">¶</a><a title="Go to the HTML source file" class="srclink" href="./htmlsrc/tango.text.Regex.html#L3703">#</a></dt>
<dd class="ddef">
<div class="summary">Search input for match.</div>
<p class="sec_header">Returns:</p>false for no match, true for match</dd>
<dt class="decl">bool <a class="symbol _function" name="RegExpT.test:2" href="./htmlsrc/tango.text.Regex.html#L3715" kind="function" beg="3715" end="3900">test</a><span class="params">()</span>; <a title="Permalink to this symbol" href="#RegExpT.test:2" class="symlink">¶</a><a title="Go to the HTML source file" class="srclink" href="./htmlsrc/tango.text.Regex.html#L3715">#</a></dt>
<dd class="ddef">
<div class="summary">Pick up where last test(input) or test() left off, and search again.</div>
<p class="sec_header">Returns:</p>false for no match, true for match</dd>
<dt class="decl">char_t[] <a class="symbol _function" name="RegExpT.match" href="./htmlsrc/tango.text.Regex.html#L3909" kind="function" beg="3909" end="3918">match</a><span class="params">(uint <em>index</em>)</span>; <a title="Permalink to this symbol" href="#RegExpT.match" class="symlink">¶</a><a title="Go to the HTML source file" class="srclink" href="./htmlsrc/tango.text.Regex.html#L3909">#</a></dt>
<dt class="decl">char_t[] <a class="symbol _function" name="RegExpT.opIndex" href="./htmlsrc/tango.text.Regex.html#L3921" kind="function" beg="3921" end="3924">opIndex</a><span class="params">(uint <em>index</em>)</span>; <a title="Permalink to this symbol" href="#RegExpT.opIndex" class="symlink">¶</a><a title="Go to the HTML source file" class="srclink" href="./htmlsrc/tango.text.Regex.html#L3921">#</a></dt>
<dd class="ddef">
<div class="summary">Return submatch with the given index.</div>
<p class="sec_header">Params:</p>
<table class="params">
<tr><td><em>index</em></td><td>index   index = 0 returns whole match, index > 0 returns submatch of bracket #index</td></tr></table>
<p class="sec_header">Returns:</p>Slice of input for the requested submatch, or null if no such submatch exists.</dd>
<dt class="decl">char_t[] <a class="symbol _function" name="RegExpT.pre" href="./htmlsrc/tango.text.Regex.html#L3930" kind="function" beg="3930" end="3936">pre</a><span class="params">()</span>; <a title="Permalink to this symbol" href="#RegExpT.pre" class="symlink">¶</a><a title="Go to the HTML source file" class="srclink" href="./htmlsrc/tango.text.Regex.html#L3930">#</a></dt>
<dd class="ddef">
<div class="summary">Return the slice of the input that precedes the matched substring.
        If no match was found, null is returned.</div></dd>
<dt class="decl">char_t[] <a class="symbol _function" name="RegExpT.post" href="./htmlsrc/tango.text.Regex.html#L3942" kind="function" beg="3942" end="3947">post</a><span class="params">()</span>; <a title="Permalink to this symbol" href="#RegExpT.post" class="symlink">¶</a><a title="Go to the HTML source file" class="srclink" href="./htmlsrc/tango.text.Regex.html#L3942">#</a></dt>
<dd class="ddef">
<div class="summary">Return the slice of the input that follows the matched substring.
        If no match was found, the whole slice of the input that was processed in the last test.</div></dd>
<dt class="decl">char_t[][] <a class="symbol _function" name="RegExpT.split" href="./htmlsrc/tango.text.Regex.html#L3968" kind="function" beg="3968" end="3986">split</a><span class="params">(char_t[] <em>input</em>)</span>; <a title="Permalink to this symbol" href="#RegExpT.split" class="symlink">¶</a><a title="Go to the HTML source file" class="srclink" href="./htmlsrc/tango.text.Regex.html#L3968">#</a></dt>
<dd class="ddef">
<div class="summary">Splits the input at the matches of this regular expression into an array of slices.</div>
<p class="sec_header">Example:</p><pre class="d_code">
<span class="k">import</span> <span class="i">tango</span>.<span class="i">io</span>.<span class="i">Stdout</span>;
<span class="k">import</span> <span class="i">tango</span>.<span class="i">text</span>.<span class="i">Regex</span>;

<span class="k">void</span> <span class="i">main</span>()
{
    <span class="k">auto</span> <span class="i">strs</span> = <span class="i">Regex</span>(<span class="sl">"ab"</span>).<span class="i">split</span>(<span class="sl">"abcabcababqwer"</span>);
    <span class="k">foreach</span>( <span class="i">s</span>; <span class="i">strs</span> )
        <span class="i">Stdout</span>.<span class="i">formatln</span>(<span class="sl">"{}"</span>, <span class="i">s</span>);
}
<span class="lc">// Prints:</span>
<span class="lc">// c</span>
<span class="lc">// c</span>
<span class="lc">// qwer</span>
</pre></dd>
<dt class="decl">char_t[] <a class="symbol _function" name="RegExpT.replaceAll" href="./htmlsrc/tango.text.Regex.html#L3991" kind="function" beg="3991" end="4008">replaceAll</a><span class="params">(char_t[] <em>input</em>, char_t[] <em>replacement</em>, char_t[] <em>output_buffer</em> = null)</span>; <a title="Permalink to this symbol" href="#RegExpT.replaceAll" class="symlink">¶</a><a title="Go to the HTML source file" class="srclink" href="./htmlsrc/tango.text.Regex.html#L3991">#</a></dt>
<dd class="ddef">
<div class="summary">Returns a copy of the input with all matches replaced by replacement.</div></dd>
<dt class="decl">char_t[] <a class="symbol _function" name="RegExpT.replaceLast" href="./htmlsrc/tango.text.Regex.html#L4013" kind="function" beg="4013" end="4034">replaceLast</a><span class="params">(char_t[] <em>input</em>, char_t[] <em>replacement</em>, char_t[] <em>output_buffer</em> = null)</span>; <a title="Permalink to this symbol" href="#RegExpT.replaceLast" class="symlink">¶</a><a title="Go to the HTML source file" class="srclink" href="./htmlsrc/tango.text.Regex.html#L4013">#</a></dt>
<dd class="ddef">
<div class="summary">Returns a copy of the input with the last match replaced by replacement.</div></dd>
<dt class="decl">char_t[] <a class="symbol _function" name="RegExpT.replaceFirst" href="./htmlsrc/tango.text.Regex.html#L4039" kind="function" beg="4039" end="4056">replaceFirst</a><span class="params">(char_t[] <em>input</em>, char_t[] <em>replacement</em>, char_t[] <em>output_buffer</em> = null)</span>; <a title="Permalink to this symbol" href="#RegExpT.replaceFirst" class="symlink">¶</a><a title="Go to the HTML source file" class="srclink" href="./htmlsrc/tango.text.Regex.html#L4039">#</a></dt>
<dd class="ddef">
<div class="summary">Returns a copy of the input with the first match replaced by replacement.</div></dd>
<dt class="decl">char_t[] <a class="symbol _function" name="RegExpT.replaceAll:2" href="./htmlsrc/tango.text.Regex.html#L4061" kind="function" beg="4061" end="4079">replaceAll</a><span class="params">(char_t[] <em>input</em>, char_t[] delegate(RegExpT!(char_t)) <em>dg</em>, char_t[] <em>output_buffer</em> = null)</span>; <a title="Permalink to this symbol" href="#RegExpT.replaceAll:2" class="symlink">¶</a><a title="Go to the HTML source file" class="srclink" href="./htmlsrc/tango.text.Regex.html#L4061">#</a></dt>
<dd class="ddef">
<div class="summary">Calls dg for each match and replaces it with dg's return value.</div></dd>
<dt class="decl">string <a class="symbol _function" name="RegExpT.compileToD" href="./htmlsrc/tango.text.Regex.html#L4085" kind="function" beg="4085" end="4290">compileToD</a><span class="params">(string <em>func_name</em> = "match", bool <em>lexer</em> = false)</span>; <a title="Permalink to this symbol" href="#RegExpT.compileToD" class="symlink">¶</a><a title="Go to the HTML source file" class="srclink" href="./htmlsrc/tango.text.Regex.html#L4085">#</a></dt>
<dd class="ddef">
<div class="summary">Compiles the regular expression to D code.</div></dd></dl></dd></dl>
</div>
<div id="footer">
  <p>Copyright (c) 2007-2008 Jascha Wetzel. All rights reserved.</p>
  <p>Page generated by <a href="http://code.google.com/p/dil">dil</a> on Fri Dec 26 04:04:39 2008. Rendered by <a href="http://code.google.com/p/dil/wiki/Kandil">kandil</a>.</p>
</div>
</body>
</html>